    Modelling Users, Intentions, and Structure in Spoken Dialog

    We outline how utterances in dialogs can be interpreted using a partial first-order logic. We exploit the capability of this logic to talk about the truth status of formulae to define a notion of coherence between utterances and explain how this coherence relation can serve for the construction of AND/OR trees that represent the segmentation of the dialog. In a BDI model we formalize basic assumptions about dialog and cooperative behaviour of participants. These assumptions provide a basis for inferring speech acts from coherence relations between utterances and attitudes of dialog participants. Speech acts prove to be useful for determining dialog segments defined on the notion of completing expectations of dialog participants. Finally, we sketch how explicit segmentation signalled by cue phrases and performatives is covered by our dialog model. Comment: 17 page

    Combining Expression and Content in Domains for Dialog Managers

    We present work in progress on abstracting dialog managers from their domain in order to implement a dialog manager development tool that takes (among other data) a domain description as input and delivers a new dialog manager for the described domain as output. We focus on two topics: first, the construction of domain descriptions with description logics, and second, the interpretation of utterances in a given domain. Comment: 5 pages, uses conference.st

    Semantic Processing of Out-Of-Vocabulary Words in a Spoken Dialogue System

    One of the most important causes of failure in spoken dialogue systems is usually neglected: the problem of words that are not covered by the system's vocabulary (out-of-vocabulary or OOV words). This paper describes a methodology for the detection, classification and processing of OOV words in an automatic train timetable information system. It reports the extensions that had to be made to the different modules of the system, which resulted in the design of appropriate dialogue strategies, as well as encouraging evaluation results on the new versions of the word recogniser and the linguistic processor. Comment: 4 pages, 2 eps figures, requires LaTeX2e, uses eurospeech.sty and epsfi

    Rule based replication strategy for heterogeneous, autonomous information systems

    In the rule-based replication strategy RegRess, read and write accesses to the replicas are coordinated by means of replication rules. These rules are formulated in the purpose-built rule language RRML, which allows both business and technical requirements to be taken into account. Before every access to the replicas, inference is performed over these rules to determine the replicas affected. In this way RegRess realizes a wide range of consistency behaviour; in particular, temporary inconsistencies are tolerated. A rule set containing the rules specified for a particular use case forms the configuration of RegRess. Because the rules can take system states into account, the behaviour can be adapted at run time. RegRess is therefore a configurable, adaptive replication strategy. RegRess is implemented by the replication manager KARMA, which contains a rule interpreter for RRML.
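
    The idea of evaluating rules before each access to decide which replicas are affected can be illustrated with a minimal Python sketch. This is not RRML and not the KARMA interpreter: the names `Access`, `Rule`, `select_replicas`, the replica names, and the `load` state variable are all hypothetical, chosen only to show rules that depend on both the access and the current system state.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Access:
    kind: str  # "read" or "write" (hypothetical encoding)
    item: str  # name of the data item being accessed


@dataclass(frozen=True)
class Rule:
    # Predicate over the access and the current system state.
    condition: Callable[[Access, dict], bool]
    # Replicas that the access must touch when the condition holds.
    replicas: frozenset


def select_replicas(access: Access, state: dict, rules: list) -> set:
    """Evaluate every rule and return the union of replicas of all rules that fire."""
    selected = set()
    for rule in rules:
        if rule.condition(access, state):
            selected |= rule.replicas
    return selected


# Hypothetical configuration: writes always reach the primary replica, but
# reach the backup only under low load, so the backup may lag temporarily.
RULES = [
    Rule(lambda a, s: a.kind == "write", frozenset({"primary"})),
    Rule(lambda a, s: a.kind == "write" and s["load"] < 0.8, frozenset({"backup"})),
    Rule(lambda a, s: a.kind == "read", frozenset({"local"})),
]

low_load = select_replicas(Access("write", "x"), {"load": 0.5}, RULES)
high_load = select_replicas(Access("write", "x"), {"load": 0.9}, RULES)
```

    In this sketch the same write is routed differently depending on the `load` value, which mirrors the abstract's point that behaviour can change at run time without changing the rule set.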

    Topic spotting using subword units

    In this paper we present a new approach for topic spotting based on subword units and feature vectors instead of words. In our first approach, we only use vector quantized feature vectors and polygram language models for topic representation. In the second approach, we use phonemes instead of the vector quantized feature vectors and model the topics again using polygram language models. We trained and tested the two methods on two different corpora. The first is a part of a media corpus which contains data from TV shows for three different topics. The second is the VERBMOBIL corpus, where we used 18 dialog acts as topics. Each corpus was split into disjoint test and training sets. We achieved recognition rates of up to 82% for the three topics of the media corpus and up to 64% using 18 dialog acts of the VERBMOBIL corpus as topics.
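
    The second approach, one language model per topic over phoneme sequences, can be sketched as follows. This is an illustration under assumptions, not the authors' system: it substitutes a plain bigram model with add-one smoothing for the polygram models named in the abstract, and the phoneme symbols, topic names, and training data are invented toy values.

```python
import math
from collections import Counter


def train_bigram_model(sequences):
    """Count phoneme bigrams and their left contexts over the training sequences."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def log_likelihood(seq, model, vocab_size):
    """Log-probability of a phoneme sequence under an add-one smoothed bigram model."""
    bigrams, contexts, _ = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def spot_topic(seq, models):
    """Return the topic whose language model assigns the sequence the highest likelihood."""
    # Use one shared vocabulary size so the topic scores are comparable.
    vocab_size = len(set().union(*(m[2] for m in models.values())))
    return max(models, key=lambda topic: log_likelihood(seq, models[topic], vocab_size))


# Invented toy training data: phoneme-like symbol sequences per topic.
TRAINING = {
    "weather": [["r", "ey", "n"], ["s", "ah", "n"], ["r", "ey", "n", "iy"]],
    "sport": [["g", "ow", "l"], ["b", "ao", "l"], ["g", "ow", "l", "z"]],
}
MODELS = {topic: train_bigram_model(seqs) for topic, seqs in TRAINING.items()}

predicted = spot_topic(["r", "ey", "n"], MODELS)
```

    Because the decision only compares likelihoods of symbol sequences, the same scheme would apply if the phonemes were replaced by vector quantizer codebook indices, as in the first approach.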

    Classification of boundaries and accents in spontaneous speech

    Syntactic-prosodic labeling of large spontaneous speech data-bases

    In automatic speech understanding, the division of continuously running speech into syntactic chunks is a major problem. Syntactic boundaries are often marked by prosodic means. Large databases are necessary for training statistical models of prosodic boundaries. For the German Verbmobil project (automatic speech-to-speech translation), we developed a syntactic-prosodic labeling scheme where two main types of boundaries (major syntactic boundaries and syntactically ambiguous boundaries) and some other special boundaries are labeled for a large Verbmobil spontaneous speech corpus. We compare the results of classifiers (multilayer perceptrons and language models) trained on these syntactic-prosodic boundary labels with classifiers trained on perceptual-prosodic and pure syntactic labels. The main advantage of the rough syntactic-prosodic labels presented in this paper is that large amounts of data could be labeled within a short time. Therefore, the classifiers trained with these labels turned out to be superior (recognition rates of up to 96%).

    Detection of phrase boundaries and accents

    Experiments on the recognition of phrase boundaries and phrase accents were performed on a large speech database read by untrained speakers. We used durational features as well as features derived from pitch and energy contours and pause information. Different sets of features were compared. A recognition rate of 75.7% was achieved for distinguishing three different boundary classes, and a recognition rate of 88.7% for distinguishing accentuated from unaccentuated syllables.

    Prosodic processing and its use in Verbmobil

    We present the prosody module of the VERBMOBIL speech-to-speech translation system, the first complete system worldwide that successfully uses prosodic information in the linguistic analysis. This is achieved by computing probabilities for clause boundaries, accentuation, and different types of sentence mood for each of the word hypotheses computed by the word recognizer. These probabilities guide the search of the linguistic analysis. Disambiguation is already achieved during the analysis and not by a prosodic verification of different linguistic hypotheses. So far, the most useful prosodic information is provided by clause boundaries. These are detected with a recognition rate of 94%. For the parsing of word hypothesis graphs, the use of clause boundary probabilities yields a speed-up of 92% and a 96% reduction of alternative readings.

    Dialog act classification with the help of prosody

    This paper presents automatic methods for the segmentation and classification of dialog acts (DAs). In Verbmobil it is often sufficient to recognize the sequence of DAs occurring during a dialog between the two partners. Since a turn can consist of one or more successive DAs, we conduct the classification of DAs in a two-step procedure: first, each turn has to be segmented into units that correspond to a DA, and second, the DA categories have to be identified. For the segmentation we use polygrams and multilayer perceptrons, using prosodic features. The classification of DAs is done with semantic classification trees and polygrams.